A Brief Discussion on Copilot CLI's Autopilot and YOLO Mode Mechanisms and Quota Pitfalls
TLDR
- YOLO mode automatically approves all high-risk actions (read/write, delete, terminal execution); please ensure your code is under version control.
- Autopilot mode enters an autonomous loop. If the model continues to attempt subsequent actions after a task is completed, it will lead to significant quota consumption.
- The billing mechanism for Autopilot is "each autonomous continuation step deducts one premium request," which differs from the billing logic of VS Code's `chat.agent.maxRequests`.
- It is recommended to use Autopilot only for complex tasks. If you are simply asking questions, avoid enabling this mode to prevent unnecessary quota depletion.
- You can use the `--max-autopilot-continues` parameter to limit the maximum number of autonomous executions and prevent infinite loops.
Features You Need to Know for Automated Execution
WARNING
Automated execution carries risks. Before running, ensure your code is under version control, and exercise caution if the task involves external interfaces or database connections.
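One way to act on this warning is to check that the working tree has no uncommitted changes before starting an automated run. The sketch below is only an illustration: the function names are hypothetical, and it assumes that "clean output from `git status --porcelain`" is an acceptable stand-in for "under version control".

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints one line per changed or untracked file.

    Empty output therefore means there is nothing uncommitted to lose.
    """
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether the tree is clean (hypothetical helper)."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if repo_dir is not a git repository
    )
    return is_clean(result.stdout)
```

Calling `working_tree_is_clean()` before launching an automated session, and committing or stashing first when it returns `False`, keeps a rollback point available.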
YOLO Mode
When you might encounter this: When a user wants the AI to automatically execute all high-risk commands (such as rm -rf) without having to manually click through confirmation windows one by one.
YOLO (You Only Live Once) mode controls whether the system "auto-approves" all high-risk actions.
- How to enable:
  - Add the parameter at startup: `gh copilot --allow-all` (or the commonly used `--yolo` parameter in the community).
  - If the Copilot interface is already open, you can enter slash commands: `/yolo` or `/allow-all`.
- Actual operation:
  - Under normal circumstances, even if the AI decides the next step is to run `rm -rf`, the system will still pop up a confirmation window by default.
  - After enabling YOLO, the aforementioned confirmations are bypassed silently.
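The approval behavior described above can be pictured as a small gate in front of each proposed command. This is a conceptual sketch, not Copilot CLI's actual implementation: the list of high-risk prefixes, the function names, and the `confirm` callback are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical prefixes treated as high-risk; not the CLI's real list.
HIGH_RISK_PREFIXES = ("rm ", "git push", "curl ")


def needs_confirmation(command: str, yolo: bool) -> bool:
    """Return True when the user should be asked before the command runs."""
    is_high_risk = command.startswith(HIGH_RISK_PREFIXES)
    return is_high_risk and not yolo


def run_step(command: str, yolo: bool, confirm: Callable[[str], bool]) -> str:
    """Decide what happens to one proposed command."""
    if needs_confirmation(command, yolo) and not confirm(command):
        return "skipped"
    return "executed"  # a real tool would run the command here
```

With `yolo=True` the `confirm` callback is never consulted, which is why a mistaken `rm -rf` goes through without any chance to stop it.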
Execution Modes
When you might encounter this: When a user needs to choose different AI interaction rhythms and levels of autonomy based on task complexity.
In the Copilot CLI interactive interface, you can cycle through the following three modes using Shift + Tab:
- Standard: The default interaction mode where the user provides instructions step-by-step, and the AI waits for the next input after responding.
- Plan: The AI first clarifies the requirements to confirm the scope, creates a structured implementation plan, and only executes after the plan is confirmed.
- Autopilot: The AI enters an autonomous loop, without waiting for user input at every step, until the task is completed, an error is encountered, the user manually presses Ctrl+C, or the continuation limit is reached.
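The stop conditions of the Autopilot loop can be sketched as follows. This is a simplified model under stated assumptions: `step` is a hypothetical callback that stands in for one model turn and returns "done", "error", or "continue", and `max_continues` plays the role of the continuation limit. Ctrl+C is not modeled explicitly; in Python it would surface as a `KeyboardInterrupt` that ends the loop.

```python
from typing import Callable, Tuple


def autopilot(step: Callable[[int], str], max_continues: int) -> Tuple[str, int]:
    """Run `step` until it reports "done" or "error", or the cap is reached.

    Returns (stop_reason, continuations_used).
    """
    continues = 0
    while True:
        outcome = step(continues)  # step(0) is the turn for the user's own prompt
        if outcome in ("done", "error"):
            return outcome, continues
        if continues >= max_continues:
            return "limit", continues
        continues += 1  # the loop answers on the user's behalf and goes on
```

A model that never reports completion uses up the whole budget, for example `autopilot(lambda n: "continue", 3)` stops with the reason "limit" after three continuations.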
Regarding the autonomous execution limits for Autopilot, the differences between Copilot CLI and VS Code settings are as follows:
| | `--max-autopilot-continues` | `chat.agent.maxRequests` |
|---|---|---|
| Tool | Copilot CLI | VS Code |
| Target of limit | Number of autonomous continuations in Autopilot | Number of AI model call turns for the Agent |
| Billing timing | Each autonomous continuation step deducts one premium request | Only user-initiated prompts are billed |
| After reaching the limit | Execution stops immediately | Asks if you want to continue |
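The billing row of the table can be illustrated by counting billable events in a session trace. This is a rough model, not an official billing formula: the event labels and function name are hypothetical, and it assumes the CLI bills the user's own prompt in addition to each autonomous continuation.

```python
def billed_requests(events: list[str], tool: str) -> int:
    """Count billable events in a session trace.

    events: "user" for a prompt typed by the user,
            "auto" for a step the agent took on its own.
    """
    if tool == "cli":
        billable = {"user", "auto"}  # assumption: user prompts are billed too
    elif tool == "vscode":
        billable = {"user"}  # only user-initiated prompts are billed
    else:
        raise ValueError(f"unknown tool: {tool}")
    return sum(1 for event in events if event in billable)
```

For a trace of one user prompt followed by three autonomous steps, this model bills four events under the CLI rule and one under the VS Code rule.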
Autopilot Quota Pitfalls
When you might encounter this: When using Autopilot for simple Q&A, or when the model fails to correctly determine the termination condition after a task is completed, leading to an infinite loop.
The mechanism of Autopilot is: whenever the model stops to wait for user confirmation and the user does not respond, Autopilot replies on the user's behalf and continues execution. Each such "reply on behalf" round-trip deducts from your quota.
Cause Analysis and Verification
GPT-family models often ask, once a task is completed, whether to perform follow-up actions. If Autopilot is enabled, that question is answered on the user's behalf, which triggers the next step. With lower-tier models or simple Q&A, the model may misjudge completion and keep asking for confirmation from different angles, producing multiple "Continuing autonomously (0.33 premium requests)" deduction records.
If you are using a high-cost model like Claude Opus, the cost of meaningless triggers will increase significantly when Autopilot fails to terminate properly.
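A back-of-the-envelope estimate shows how quickly meaningless continuations add up, and how a cap bounds the damage. The function below is a hypothetical helper. The 0.33 per-step cost comes from the deduction record quoted above, while any other per-step cost you plug in for a more expensive model is your own assumption. It also assumes the whole cap is still available when the task finishes.

```python
from typing import Optional


def wasted_quota(extra_continues: int, cost_per_step: float,
                 cap: Optional[int] = None) -> float:
    """Premium requests spent on continuations that happen after the task is done.

    cap: upper bound on continuations (the role of a continuation limit);
         None means no limit.
    """
    billed_steps = extra_continues if cap is None else min(extra_continues, cap)
    return round(billed_steps * cost_per_step, 2)
```

Under these assumptions, ten needless continuations at 0.33 each cost 3.3 premium requests without a cap, and 0.99 when the cap is 3.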
References
- GitHub Issue #1532: Infinite loop issue in Autopilot mode
- GitHub Issue #1477: Discussion on subsequent requests consuming quota
Conclusion and Recommendations
- When your quota is sufficient and you are using a model with strong execution capabilities, consider enabling YOLO + Autopilot for autonomous optimization.
- In most scenarios, enabling only YOLO is sufficient to meet your needs; Autopilot is not necessarily required.
- If you are only asking questions rather than executing tasks, do not enable Autopilot, as it is highly likely to cause unnecessary quota consumption.
- Be sure to make good use of the `--max-autopilot-continues` parameter to set a limit for autonomous execution and prevent infinite loops from burning through your quota.
Changelog
- 2026-03-22 Initial document creation.
